    Sound regular expression semantics for dynamic symbolic execution of JavaScript

    Existing support for regular expressions in automated test generation or verification tools is lacking. Common aspects of regular expression engines found in mainstream programming languages, such as backreferences or greedy matching, are often ignored or imprecisely approximated, leading to poor test coverage or failed proofs. In this paper, we present the first complete strategy to faithfully reason about regular expressions in the context of symbolic execution, focusing on the operators found in JavaScript. We model regular expression operations using string constraints and classical regular expressions and use a refinement scheme to address the problem of matching precedence and greediness. Our survey of over 400,000 JavaScript packages from the NPM software repository shows that one fifth make use of complex regular expression features. We implemented our model in a dynamic symbolic execution engine for JavaScript and evaluated it on over 1,000 Node.js packages containing regular expressions, demonstrating that the strategy is effective and can increase line coverage of programs by up to 30%.
    Comment: This arXiv version (v4) contains fixes for some typographical errors of the PLDI'19 version (the numbering of indices in Section 4.1 and the example in Section 4.3).
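To make the two features named in the abstract concrete, here is a minimal sketch of greedy matching and backreferences. It uses Python's `re` module rather than a JavaScript engine, so it only illustrates the general behaviour and is not the paper's model; the helper `is_doubled` is a hypothetical name introduced for this example.

```python
import re

# Greediness decides how a match is split between capture groups.
# Both patterns match all of "aaa", but the captures differ.
greedy = re.match(r"(a+)(a*)", "aaa")    # a+ takes as much as it can
lazy = re.match(r"(a+?)(a*)", "aaa")     # a+? takes as little as it can

greedy_groups = greedy.groups()  # ('aaa', '')
lazy_groups = lazy.groups()      # ('a', 'aa')


def is_doubled(s: str) -> bool:
    """Return True if s has the form xx for some non-empty x.

    The backreference \\1 must repeat exactly what group 1 captured,
    which is beyond the power of classical regular expressions.
    """
    return re.fullmatch(r"(.+)\1", s) is not None
```

A tool that treats `(a+)(a*)` as a plain regular language would accept the right strings but could not say which group captured what, which is the kind of imprecision the abstract describes.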

    On Pattern Expression Languages

    Abstract. In this paper we show that the family of pattern expression languages is closed under intersection with regular languages. Since this family is not closed under complement but is closed under reverse, a natural question arises: whether particular languages, such as those containing words of type ww^R, are pattern expression languages or not. We give a proof of a negative answer to this question, and we provide several examples of languages that cannot be specified by pattern expressions.

    Two Extensions of Cover Automata

    Deterministic Finite Cover Automata (DFCA) are compact representations of finite languages. Deterministic Finite Automata with “do not care” symbols and Multiple Entry Deterministic Finite Automata are both compact representations of regular languages. This paper studies the benefits of combining these representations to get even more compact representations of finite languages. DFCAs are extended either by accepting “do not care” symbols or by considering multiple entry DFCAs. For each of the two models, we study the existence of minimization or simplification algorithms and their computational complexity, the state complexity of these representations compared with other representations of the same language, and the bounds for state complexity when one representation is transformed into another. Minimization for both models proves to be NP-hard. A method is presented to transform minimization algorithms for deterministic automata into simplification algorithms applicable to these extended models. DFCAs with “do not care” symbols prove to have state complexity comparable to that of Nondeterministic Finite Cover Automata. Furthermore, for multiple entry DFCAs, we can give a tight estimate of the state complexity of the transformation into an equivalent DFCA.
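The compactness of cover automata can be illustrated with a small sketch. It assumes the usual definition, which the abstract does not spell out: a DFA is a cover automaton for a finite language L with longest word length l if it agrees with L on every word of length at most l. The function names and the dictionary encoding of the transition function are choices made for this example only.

```python
from itertools import product


def accepts(delta, start, finals, word):
    """Run a complete DFA given as a dict (state, symbol) -> state."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in finals


def is_cover_automaton(delta, start, finals, language, alphabet):
    """Check that the DFA agrees with `language` on all words of
    length <= l, where l is the length of the longest word in it."""
    longest = max(map(len, language), default=0)
    for n in range(longest + 1):
        for letters in product(alphabet, repeat=n):
            word = "".join(letters)
            if accepts(delta, start, finals, word) != (word in language):
                return False
    return True


# One accepting state with a self-loop accepts a*, which is infinite,
# yet it covers the finite language {"", "a", "aa"} (here l = 2).
loop_delta = {(0, "a"): 0}
covers_small = is_cover_automaton(loop_delta, 0, {0}, {"", "a", "aa"}, "a")
# The same automaton accepts "", so it does not cover {"a"}.
covers_single = is_cover_automaton(loop_delta, 0, {0}, {"a"}, "a")
```

The one-state automaton ignores what happens beyond length l, which is where the savings over an ordinary DFA for the same finite language come from.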

    Descriptional complexity in encoded Blum static complexity spaces

    Algorithmic Information Theory is based on the notion of descriptional complexity known as Chaitin-Kolmogorov complexity, defined in the '60s in terms of minimal description length. Blum Static Complexity spaces, defined using Blum axioms, and Encoded Function spaces, defined using properties of the complexity function, were introduced in 2012 to generalize the concept of descriptional complexity. In formal language theory we also use the concept of descriptional complexity for the number of states, or the number of transitions, in a minimal finite automaton accepting a regular language, and this number apparently has no connection to the general case of descriptional complexity. In this paper we prove that all the definitions of descriptional complexity, including complexity of operations, can be defined within the framework of Encoded Blum Static Complexity spaces, which extend both Blum Static Complexity spaces and Encoded Function spaces.

    Preface

    State complexity of the subword closure operation with applications to DNA coding

    We are interested in the state complexity of languages that are defined via the subword closure operation. The subword closure of a set S of fixed-length words is the set of all words w for which every subword of w of the fixed length is in S. This type of constraint appears to be useful in various situations related to data encodings, and in particular to DNA encodings. We present a few results related to this concept. In particular, we give a general upper bound on the state complexity of a subword closed language and show that this bound is tight infinitely often. We also discuss the state complexity of DNA-computing-related cases of the subword closure operation.
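The definition of subword closure above can be sketched directly in code. This is a minimal illustration under two assumptions that the abstract does not state: "subword" is read as a contiguous factor, and words shorter than the fixed length k belong to the closure vacuously. The function names are hypothetical.

```python
from itertools import product


def in_subword_closure(word, allowed, k):
    """True if every length-k contiguous subword of `word` is in `allowed`.

    Words shorter than k have no length-k subword, so they pass vacuously
    (an assumption of this sketch).
    """
    return all(word[i:i + k] in allowed for i in range(len(word) - k + 1))


def closure_up_to(allowed, k, alphabet, max_len):
    """List the words of the closure up to length max_len, shortest first."""
    result = []
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            word = "".join(letters)
            if in_subword_closure(word, allowed, k):
                result.append(word)
    return result


S = {"ab", "ba"}
words = closure_up_to(S, 2, "ab", 3)
# words == ['', 'a', 'b', 'ab', 'ba', 'aba', 'bab']
```

With S = {"ab", "ba"}, the word "abab" is in the closure, while "abba" is not because it contains the subword "bb".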